Rank-aware, Approximate Query Processing on the Semantic Web
Abstract
The amount of data on the WWW that adheres to Semantic Web standards is increasing rapidly. Searches over this huge Web data corpus frequently lead to queries with large result sets. To discover the data elements that satisfy a given information need, users must therefore rely on ranking techniques that sort results by relevance. Unfortunately, processing queries with ranked results over a large data corpus is expensive in terms of both computation time and computing resources. At the same time, applications often face information needs that do not require complete and exact results. This thesis addresses the problem of how to process queries over Web data in an approximate and rank-aware fashion, and makes several novel contributions. First, we introduce a rank-aware join operator for Web data. With this operator, queries with ranked results can be processed much more efficiently, because the operator focuses on computing the top-ranked query results first and omits the remaining results. Second, we enable systems to trade off result completeness and accuracy in favor of query computation time. We provide two contributions for this approximate query processing. On the one hand, we present a novel pipeline of operations that allows query results to be computed incrementally. On the other hand, we introduce a new approximate rank-aware join operator, which discards intermediate query results that are unlikely to lead to a final top-ranked result. Third, we present a novel approach to selectivity estimation that is tailored to the needs of Web data and typical Web queries. That is, our approach allows selectivity estimation for queries that match structured as well as unstructured data elements in the Web of data.
Such selectivity estimation is crucial for query optimization techniques, which can integrate our approximate/rank-aware join operators into physical query plans.
Date: Wednesday, 9 April 2014, 15:45
Location: Englerstraße 11, 76131 Karlsruhe, Kollegiengebäude am Ehrenhof (Geb. 11.40), 2nd floor (2. OG), Room 231 (information for visitors: www.aifb.kit.edu/web/Kontakt)
Organizer: Institut AIFB, Knowledge Management research group (Forschungsgruppe Wissensmanagement)
The Institut für Angewandte Informatik und Formale Beschreibungsverfahren cordially invites all interested parties to this talk.
Andreas Oberweis, Hartmut Schmeck, Detlef Seese, Wolffried Stucky, Rudi Studer (organizer), Stefan Tai
Similar resources
Query expansion based on relevance feedback and latent semantic analysis
Web search engines are among the most popular tools on the Internet and are widely used by expert and novice users. Constructing an adequate query that best specifies the user's information need to the search engine is an important concern of web users. Query expansion is a way to reduce this concern and increase user satisfaction. In this paper, a new method of query expa...
Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology
Due to the growth of the Web, there are many challenges in establishing a general framework for data mining and for retrieving structured data from the Web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...
An Effective Path-aware Approach for Keyword Search over Data Graphs
Keyword search is known as a user-friendly alternative to structured languages for retrieving information from graph-structured data. Efficient retrieval of relevant answers to a keyword query and effective ranking of these answers according to their relevance are two main challenges in keyword search over graph-structured data. In this paper, a novel scoring function is proposed, w...
Endowing Semantic Query Languages with Advanced Relaxation Capabilities
The problem of relaxing Semantic Web Database (SWDB) queries that return an empty or unsatisfactory set of answers has been addressed by several works in recent years. Most of these studies have focused on developing new relaxation techniques or on optimizing top-k query processing. However, only a few works have been conducted to provide a fine and declarative control of query relaxation to ...
DAW: Duplicate-AWare Federated Query Processing over the Web of Data
Over recent years, the Web of Data has developed into a large compendium of interlinked data sets from multiple domains. Due to the decentralised architecture of this compendium, several of these data sets contain duplicated data. Yet, so far, little attention has been paid to the effect of duplicated data on federated querying. This work presents DAW, a novel duplicate-aware approach to f...
Publication date: 2014